Data-driven foot-based intonation generator for text-to-speech synthesis

Authors

  • Mahsa Sadat Elyasi Langarani
  • Jan P. H. van Santen
  • Seyed Hamidreza Mohammadi
  • Alexander Kain
Abstract

We propose a method for generating F0 contours for text-to-speech synthesis. Training speech is automatically annotated in terms of feet, with features indicating start and end times of syllables, foot position, and foot length. During training, we fit a foot-based superpositional intonation model comprising accent curves and phrase curves. During synthesis, the method searches for stored, fitted accent curves associated with feet that optimally match to-be-synthesized feet in the feature space, while minimizing differences between successive accent curve heights. We tested the proposed method against the HMM-based Speech Synthesis System (HTS) by imposing contours generated by these two methods onto natural speech, and obtaining quality ratings. Test sets varied in how well they were covered by the training data. Contours generated by the proposed method were preferred over HTS-generated contours, especially for poorly-covered test items. To test the new method’s usefulness for processing marked-up text input, we compared its ability to convey contrastive stress with that of natural speech recordings, and found no difference. We conclude that the new method holds promise for generating comparatively high-quality F0 contours, especially when training data are sparse and when mark-up is required.
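
The synthesis-time search described in the abstract can be illustrated with a small dynamic-programming sketch. The abstract does not say which search algorithm, distance measure, or weighting the authors use, so everything here is an assumption made for illustration: a Viterbi-style search, Euclidean distance in feature space as the match cost, absolute difference in accent curve height as the smoothness cost, and the hypothetical names `StoredCurve`, `select_curves`, and `join_weight`.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StoredCurve:
    """A fitted accent curve kept from training (hypothetical structure)."""
    features: Sequence[float]  # assumed encoding, e.g. (foot position, foot length)
    height: float              # assumed summary of the curve: its peak height


def target_cost(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors (an assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def select_curves(targets: List[Sequence[float]],
                  inventory: List[StoredCurve],
                  join_weight: float = 0.1) -> List[int]:
    """Return one inventory index per target foot.

    Minimizes the sum of feature-space distances plus join_weight times the
    height difference between successive selected curves.
    """
    if not targets or not inventory:
        return []
    m = len(inventory)
    # cost[j]: cheapest path so far that ends with inventory[j]
    cost = [target_cost(targets[0], c.features) for c in inventory]
    back: List[List[int]] = []
    for feat in targets[1:]:
        new_cost, pointers = [], []
        for cand in inventory:
            # Best predecessor, accounting for the height jump into cand.
            best = min(
                range(m),
                key=lambda i: cost[i]
                + join_weight * abs(inventory[i].height - cand.height),
            )
            step = join_weight * abs(inventory[best].height - cand.height)
            new_cost.append(cost[best] + step + target_cost(feat, cand.features))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest final candidate.
    j = min(range(m), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With `join_weight` set to zero the search reduces to picking the nearest stored curve for each foot independently; raising it trades feature-space match for smoother successive heights.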

Similar resources

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics in the multimedia domain concerns enabling computers to produce speech. Speech synthesis grants computers the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...

Foot-based Intonation for Text-to-Speech Synthesis using Neural Networks

We propose a method (“FONN”) for F0 contour generation for text-to-speech synthesis. Training speech is automatically segmented into left-headed feet, annotated with syllable start/end times, foot position in the sentence, and the number of syllables in the foot. During training, we fit a superpositional intonation model comprising accent curves associated with feet and phrase curves. We propos...

Structural Data-Driven Prosody Model for TTS Synthesis

This paper introduces a new data-driven prosody model for the text-to-speech system ARTIC. The model is intended to be almost language-independent and to generate naturally sounding intonation with a link to semantics. It is based on text parametrisation using a new prosodic grammar and on automatic speech corpora analysis methods. Its performance is evaluated by results of presented listening ...

Simulating Intonation in Regional Varieties of Swedish

Within the research project SIMULEKT (Simulating Intonational Varieties of Swedish), our recent work includes two approaches to simulating intonation in regional varieties of Swedish. The first involves a method for modelling intonation using the SWING (SWedish INtonation Generator) tool, where annotated speech samples are resynthesised with rule-based intonation and audio-visually analysed wit...

Synthesising intonational varieties of Swedish

Within the research project SIMULEKT (Simulating Intonational Varieties of Swedish), our recent work includes two approaches to simulating intonation in regional varieties of Swedish. The first involves a method for modeling intonation using the SWING (SWedish INtonation Generator) tool, where annotated speech samples are resynthesised with rule-based intonation and audio-visually analysed with...

Publication date: 2015